LIP: Local Importance-Based Pooling
Authors
Abstract
Spatial downsampling layers are favored in convolutional neural networks (CNNs) to downscale feature maps for larger receptive fields and less memory consumption. However, for visual recognition tasks, these layers might lose discriminative details due to improper pooling strategies. In this paper, we present a unified framework (LAN) over the common downsampling layers (e.g., average pooling, max pooling, and strided convolution) from the view of local aggregation based on local importance. In this LAN framework, we analyze the issues of widely-used pooling layers and figure out the criteria for designing an effective downsampling layer. Based on this analysis, we propose a simple, general, and effective pooling operation based on local importance modeling, termed as Local Importance-based Pooling (LIP). LIP is able to enhance discriminative features during the downsampling procedure by learning adaptive importance weights based on the inputs. To further modulate the importance in different windows and to be more efficient, we present an improved version of LIP, termed as LIP++, by introducing an explicit margin term and efficient logit modules. Our LIP++ can yield consistent accuracy improvement over the original LIP, yet with smaller computational cost. Extensive experiments show that our presented method consistently yields notable gains with different CNN architectures on the image classification task. On the challenging MS COCO dataset, detectors with our LIP-ResNets as backbones obtain a consistent performance improvement over vanilla ResNets on both bounding box detection and instance segmentation. Finally, we also verify the effectiveness of LIP on the tasks of pose estimation and semantic segmentation, demonstrating its generalization ability for dense prediction tasks.
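The local-aggregation view described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function name `importance_pool` is hypothetical, the input is a single 2-D map, and the `logits` array is supplied by the caller as a stand-in for whatever learned logit module produces the importance scores. Under those assumptions, each output value is a weighted average of its window, with weights given by the exponentiated logits.

```python
import numpy as np


def importance_pool(x, logits, k=2, stride=2):
    """Pool a 2-D map by importance-weighted averaging over k x k windows.

    `logits` has the same shape as `x`; in a real network it would come from
    a learned module (an assumption here, it is just an input array).
    """
    h, w = x.shape
    out_h = (h - k) // stride + 1
    out_w = (w - k) // stride + 1
    out = np.empty((out_h, out_w), dtype=float)
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            window = x[r:r + k, c:c + k]
            window_logits = logits[r:r + k, c:c + k]
            # Subtract the window maximum before exponentiating; this leaves
            # the normalized weights unchanged and avoids overflow.
            weights = np.exp(window_logits - window_logits.max())
            out[i, j] = (weights * window).sum() / weights.sum()
    return out


x = np.arange(16, dtype=float).reshape(4, 4)
uniform = importance_pool(x, np.zeros_like(x))  # equal importance everywhere
peaked = importance_pool(x, 50.0 * x)           # importance grows steeply with x
```

With constant logits the weights are uniform and the result equals average pooling; with logits that grow steeply with the input, the largest value in each window dominates and the result approaches max pooling. This is one way to see how common pooling layers fit into a single importance-based form.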
Similar resources
A local region based approach to lip tracking
Lip tracking plays a significant role in a lip reading system. In this paper, we present a local region based approach to lip tracking, which consists of two phases: (i) lip contour extraction for the first lip frame, followed by (ii) lip tracking in the subsequent lip frames. Initially, we construct a localized color active color model provided that the foreground and background regio...
Probabilistic opinion pooling generalized. Part two: Premise-based opinion pooling
We consider the classical problem of aggregating different individuals' probability assignments (opinions) over a σ-algebra of events. In practice, some events represent basic propositions, such as "it rains" or "CO2 emissions cause global warming", while others represent combinations thereof, for instance disjunctions (unions) of basic events. It is plausible to treat the basic events as premises ...
Supplementary material of the CVPR'17 Viraliency: Pooling Local Virality
We implemented our LENA pooling layer within the Caffe framework and ran all our experiments using a Tesla K40 GPU. All the networks were fine-tuned from the convolutional filters obtained when training these networks for the 1,000 image classification task on the ImageNet dataset. We iterated the stochastic gradient descent algorithm for 10,000 iterations with a momentum of μ = 0.9 and a weigh...
Local processes and spatial pooling in texture and symmetry detection
We examined the ability of human observers to detect three kinds of statistical structure in binary arrays: first-order statistics (luminance), local fourth-order statistics (isodipole textures), and long-range statistics (bilateral symmetry). Performance was closest to ideal on the luminance task and furthest from ideal on the symmetry task. For each kind of statistic, the dependence of perfor...
Journal
Journal title: International Journal of Computer Vision
Year: 2022
ISSN: 0920-5691, 1573-1405
DOI: https://doi.org/10.1007/s11263-022-01707-4